Abstract

We describe a quantum black-box network computing the majority of $N$ bits with zero-sided
error $\varepsilon$ using only $\frac{2}{3}N + O(\sqrt{N \log(\varepsilon^{-1} \log N)})$ queries: the algorithm returns the correct
answer with probability at least $1 - \varepsilon$, and "I don't know" otherwise. Our algorithm is given
as a randomized "XOR decision tree" for which the number of queries on any input is strongly
concentrated around a value of at most $\frac{2}{3}N$. We provide a nearly matching lower bound of
$\frac{2}{3}N - O(\sqrt{N})$ on the expected number of queries on a worst-case input in the randomized XOR
decision tree model with zero-sided error $o(1)$. Any classical randomized decision tree computing
the majority on $N$ bits with zero-sided error $\frac{1}{2}$ has cost $N$.

1 Introduction 

How do you tell how a committee of three people will vote on an issue? The obvious approach is 
to ask each individual what vote he or she is planning to cast. If the first two committee members 
agree, you can skip the third one, but, if they disagree, you need to talk to all three members. 

Suppose, however, that you can perform quantum transformations on the committee members.
This allows you to ask, with one quantum question, whether the first two members agree or disagree. 
If they agree, you can disregard the third member and ask one of the first two for her vote. If the 
first two disagree, you know their votes will cancel, so it suffices to ask the third member for his 
vote. Either way, you will learn the answer in only two queries. 

In this paper, we discuss generalizations of this procedure to arbitrarily many voters. We 
allow our algorithms to ask whether two voters agree at the cost of one query. We consider both 
deterministic and randomized algorithms, allowing different kinds of error. Our algorithms can 
be simulated very efficiently on quantum machines, yielding new upper bounds for the quantum 
complexity of the MAJORITY function. 



1.1 Overview 

Suppose we wish to compute the value $f(X)$ of a function $f$ on $\{0, 1\}^N$ where the input $X$ is given
to us as a black box $X : \{0, \ldots, N - 1\} \to \{0, 1\}$. The cost of the computation will be the number
of queries we make to the oracle $X$. In the classical case, this model of computation is known as a
decision tree, and has been well-studied.

More recently, a quantum mechanical version of the model has been considered, which is inherently
probabilistic. Several complexity measures are investigated: the number of queries needed to
compute $f$ exactly, with zero-sided error $\varepsilon$, or with bounded error $\varepsilon$. Beals et al. [3] show that for
any function $f$ these measures are all polynomially related to the classical decision tree complexity.
Beals et al. also look more closely at some specific functions $f$. In particular, they consider the
majority function, whose decision tree complexity equals $N$. They prove that in the quantum model
the exact and zero-sided error cost functions are between $N/2$ and $N$ (for any $\varepsilon < 1$); a result of
Paturi [10] implies that the bounded error cost function is $\Omega(N)$ (for any constant $\varepsilon < \frac{1}{2}$).

In this paper, we investigate these cost measures for MAJORITY more closely. We provide 
improved upper bounds, as well as matching lower bounds in related models. 

Our first result is a quantum black-box network which exactly computes MAJORITY using
$N + 1 - w(N)$ queries, where $w(N)$ equals the number of ones in the binary expansion of $N$. So,
for $N$ of the form $2^n - 1$, we can save $\lfloor \log N \rfloor$ queries.

Our algorithm exploits the fact, due to Cleve et al. [5], that the XOR of two input bits can be 
determined in a single quantum query. In fact, our algorithm can be viewed as an XOR decision 
tree, i.e., a classical decision tree with the additional power of computing the XOR of two input 
bits at the cost of a single query. The complexity of MAJORITY in this model has been studied 
before [12, 1, 2], independently of the connection with quantum computation. A tight bound of 
$N + 1 - w(N)$ was known [12, 1]. We give a simpler proof for the lower bound which generalizes
to the case where computing the parity of arbitrarily many input bits is permitted in one query. 
The lower bound shows that our procedure cannot be improved without at least introducing a new 
quantum trick. 

Our main result is a quantum black-box network that computes MAJORITY with zero-sided
error $\varepsilon$ using only $\frac{2}{3}N + O(\sqrt{N \log(\varepsilon^{-1} \log N)})$ queries. For any positive $\varepsilon$ we construct such a
network. The algorithm can be viewed as a randomized variant of an XOR decision tree given by
Alonso et al. [2]. We construct an exact randomized XOR decision tree with an expected number
of queries of at most $\frac{2}{3}N + 2 \log N$ on any input. We argue that the number of queries is sufficiently
concentrated to yield our main result.

Alonso et al. [2] show that the average cost of their algorithm over all $N$-bit inputs is
$\frac{2}{3}N - \Omega(\sqrt{N})$. They also show that the average-case complexity of MAJORITY in the XOR decision
tree model is at least $\frac{2}{3}N - O(\sqrt{N})$. We instead are interested in the cost of randomized XOR
decision trees on worst-case inputs. A standard argument shows that the Alonso et al. lower bound
also holds for the expected number of queries on a worst-case input. We also prove that classical
randomized decision trees need $N$ queries to compute MAJORITY with zero-sided error $\frac{1}{2}$.

In the general bounded-error setting, van Dam [13] has shown how to compute any function
$f$ using $\frac{N}{2} + \sqrt{N \log \varepsilon^{-1}}$ quantum queries. We point out that van Dam's technique does not
provide a zero-sided error network for MAJORITY of cost less than $N$. We prove that any classical
randomized decision tree for MAJORITY has to have cost $N$ to achieve bounded error of at most
$\frac{1}{4}$.

1.2 Organization 

Section 2 provides some preliminaries, including background on the XOR decision tree model, the 
quantum black-box model, and their relationship. Section 3 describes and analyzes our quantum 
network for computing MAJORITY exactly using $N + 1 - w(N)$ queries. In Section 4, we discuss
our randomized XOR decision tree for MAJORITY that has small zero-sided error and cost about
$\frac{2}{3}N$, and we relate this to the zero-error quantum query complexity. In Section 5, we show that
the exact algorithm of Section 3 is optimal in a generalized version of the XOR decision tree
model. In Section 6, we discuss lower bounds for the cost of randomized XOR decision trees and 
classical randomized decision trees for computing MAJORITY. Finally, in Section 7, we give a 
table summarizing the known results and propose several questions for further research. 



2 Preliminaries 

We first introduce some general notation. Then we discuss XOR decision trees, quantum black-box 
networks, and their relationship. 

Let $X = X_0 X_1 \ldots X_{N-1}$ be a Boolean string of length $N$. We will often think of $X$ as a function
$X : \{0, 1, \ldots, N - 1\} \to \{0, 1\}$. We define MAJORITY($X$) to be 0 if $X$ contains more zeros than
ones, and 1 otherwise. This is a weak definition, which we will use to establish our lower bounds.
Our algorithms will always yield a stronger result in that they will answer "tie" when the numbers
of zeros and ones are equal. The discrepancy of $X$ is the size of the majority, i.e., the absolute value
of the difference in the number of zeros and ones. XOR denotes the exclusive OR of two bits, and
PARITY($X$) denotes $\sum_i X_i \bmod 2$.

For a positive integer $N$, the Hamming weight of $N$, denoted $w(N)$, is the number of ones in
the standard binary representation for $N$. We will use the following properties.

Lemma 1 For any integer $N > 0$, $\sum_{k=1}^{\infty} \lfloor N/2^k \rfloor = N - w(N)$.

Proof. Let $\ell = \lfloor \log N \rfloor$, and write $N = \sum_{j=0}^{\ell} b_j 2^j$, where $b_j \in \{0, 1\}$. We then have:

$$\sum_{k=1}^{\infty} \lfloor N/2^k \rfloor = \sum_{k=1}^{\infty} \sum_{j=k}^{\ell} b_j 2^{j-k} = \sum_{j=1}^{\ell} \sum_{k=1}^{j} b_j 2^{j-k} = \sum_{j=1}^{\ell} b_j (2^j - 1) = \sum_{j=0}^{\ell} b_j 2^j - \sum_{j=0}^{\ell} b_j,$$

which is simply $N - w(N)$. □



Corollary 2 For any integer $N > 0$, $N!$ is exactly divisible by $2^{N - w(N)}$.

Proof. For any positive integer $k$, there are exactly $\lfloor N/2^k \rfloor$ multiples of $2^k$ contributing to $N!$.
So the exponent of the largest power of 2 dividing $N!$ is given by $\sum_{k=1}^{\infty} \lfloor N/2^k \rfloor$, which is equal to
$N - w(N)$ by Lemma 1. □
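As a quick numerical sanity check of Lemma 1 and Corollary 2, the following Python sketch compares both sides of each statement for small $N$. The helper names (`hamming_weight`, `floor_sum`, `two_adic_valuation`) are ours and purely illustrative; they do not appear in the paper.

```python
import math


def hamming_weight(n):
    """w(N): number of ones in the binary representation of n."""
    return bin(n).count("1")


def floor_sum(n):
    """Left-hand side of Lemma 1: sum over k >= 1 of floor(n / 2^k)."""
    total, k = 0, 1
    while n >> k:            # terms vanish once 2^k > n
        total += n >> k      # n >> k equals floor(n / 2^k)
        k += 1
    return total


def two_adic_valuation(m):
    """Exponent of the largest power of 2 dividing the positive integer m."""
    v = 0
    while m % 2 == 0:
        m //= 2
        v += 1
    return v


# Check Lemma 1 and Corollary 2 for a range of small N.
for n in range(1, 65):
    assert floor_sum(n) == n - hamming_weight(n)
    assert two_adic_valuation(math.factorial(n)) == n - hamming_weight(n)
```

A finite check of this kind is of course no substitute for the proofs above; it only guards against misreading the statements.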



2.1 XOR decision trees 

An XOR decision tree is an algorithm for a given input length N which adaptively queries the input 
X and outputs a value. A query may be either: 

• $X_i$, where $0 \le i \le N - 1$, or

• $X_i \oplus X_j$, where $0 \le i, j \le N - 1$ and $\oplus$ denotes XOR.

The cost on a given input $X$ is the number of queries made. The cost of an XOR decision tree is
the maximum cost over all inputs of length $N$. An XOR decision tree can be viewed as a binary
tree. The depth of this tree equals the cost of the XOR decision tree. We refer to Section 5.1 for a
further generalization of XOR decision trees.

We define a randomized XOR decision tree $T$ as an XOR decision tree in which we can toss a
coin with arbitrary bias at any point in time, and proceed based on the outcome of the coin toss.
Equivalently, we can view $T$ as a probability distribution over (deterministic) XOR decision trees.
The number of queries on a given input $X$ is a random variable. We define the cost on input $X$ as
the maximum of this random variable, and the cost of $T$ as the maximum cost over all inputs $X$.

The following definitions apply to a randomized decision tree $T$ on $N$-bit inputs, and more
generally to any probabilistic process $T$ that takes a Boolean string of length $N$ as input and outputs
a value. Let $f$ be a function on $\{0, 1\}^N$. If on any input $X$, $T$ outputs $f(X)$ with probability at
least $1 - \varepsilon$, we say that $T$ computes $f$ with error $\varepsilon$. If $T$ outputs $f(X)$ with probability at least
$1 - \varepsilon$ and says "I don't know" otherwise (i.e., $T$ never produces an incorrect output) we say that
$T$ computes $f$ with zero-sided error $\varepsilon$. In the case where $\varepsilon = 0$, we say that $T$ exactly computes $f$.

A randomized decision tree that exactly computes $f$ at cost $C$ can trivially be transformed
into a deterministic tree computing $f$ at the same cost. It can often also be transformed into a
randomized XOR decision tree for $f$ with zero-sided error $\varepsilon$ and cost $C' < C$, e.g., if on any input
the number of queries is strongly concentrated around a value less than $C'$. More precisely, suppose
that on any input $X$, with probability at least $1 - \varepsilon$, $T$ makes no more than $C'$ queries. Then we
can run $T$ but as soon as we attempt to make more than $C'$ queries, stop the process and output
"I don't know." The modified randomized decision tree has zero-sided error at most $\varepsilon$ and cost at
most $C'$.
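The truncation argument above can be sketched in Python as follows. This is only an illustration under our own conventions: a "tree" is modeled as a function `tree(query, n)` that learns the input solely through the `query` callback, and the names `truncate`, `scan_majority`, and `QueryBudgetExceeded` are hypothetical. The example tree is the trivial bit-scanning procedure, not one of the algorithms of this paper.

```python
class QueryBudgetExceeded(Exception):
    """Raised when a tree attempts more queries than the budget C' allows."""


def truncate(tree, budget):
    """Wrap `tree` so that it answers "I don't know" instead of exceeding `budget` queries."""
    def wrapped(x):
        used = 0

        def query(i, j=None):
            # query(i) returns X_i; query(i, j) returns X_i XOR X_j. Either costs 1.
            nonlocal used
            if used >= budget:
                raise QueryBudgetExceeded
            used += 1
            return x[i] if j is None else x[i] ^ x[j]

        try:
            return tree(query, len(x))
        except QueryBudgetExceeded:
            return "I don't know"

    return wrapped


def scan_majority(query, n):
    """Trivial exact algorithm: read bits until the lead exceeds the bits remaining."""
    ones = zeros = 0
    for i in range(n):
        if abs(ones - zeros) > n - i:
            break
        if query(i):
            ones += 1
        else:
            zeros += 1
    if ones == zeros:
        return "tie"
    return int(ones > zeros)
```

Since the wrapped process only ever returns the value produced by the exact tree or "I don't know", it never outputs a wrong answer, which is exactly the zero-sided error guarantee described above.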

2.2 Quantum black-box networks 

A quantum computer performs a sequence of unitary transformations $U_1, U_2, \ldots, U_T$ on a complex
Hilbert space, called the state space. The state space has a canonical orthonormal basis which is
indexed by the configurations $s$ of some classical computer $M$. The basis state corresponding to $s$
is denoted by $|s\rangle$.

The initial state $\phi_0$ is a basis state. At any point in time $t$, $1 \le t \le T$, the state $\phi_t$ is obtained
by applying $U_t$ to $\phi_{t-1}$, and can be written as

$$\phi_t = \sum_s \alpha_{s,t} |s\rangle,$$

where $\sum_s |\alpha_{s,t}|^2 = 1$.

At time $T$, we measure the state $\phi_T$. This is a probabilistic process that produces a basis state,
where the probability of obtaining state $|s\rangle$ for any $s$ equals $|\alpha_{s,T}|^2$. The output of the algorithm
is the observed state $|s\rangle$ or some part of it.

We define the quantum black-box model following Deutsch and Jozsa [7]. In a quantum black-box
network $A$ for input length $N$, the initial state is independent of the input $X = X_0 X_1 \ldots X_{N-1}$.
We allow arbitrary unitary transformations independent of $X$. In addition, we allow $A$ to make
quantum queries. This is the transformation $U$ taking the basis state $|i, b, z\rangle$ to $|i, b \oplus X_i, z\rangle$, where:

• $i$ is a binary string of length $\log N$ denoting an index into the input $X$,

• b is the contents of the location where the result of the oracle query will be placed, 

• z is a placeholder for the remainder of the state description, 

and comma denotes concatenation. 

We define the cost of $A$ to be the number of times the query transformation $U$ is performed;
all other transformations are free.

The error notions introduced in Section 2.1 for arbitrary probabilistic processes also apply to
quantum black-box networks.

2.3 From XOR decision trees to quantum black-box networks 

Bernstein and Vazirani have shown [4] that a quantum computer can efficiently simulate classical 
deterministic and probabilistic computations. It is also known that we can efficiently compose 
quantum algorithms. In terms of quantum black-box networks these results imply that a classical 
randomized decision tree T that uses quantum black-box networks as subroutines can be efficiently 
simulated by a single quantum black-box network. The cost of the simulation will be the sum of the 
cost of T and the costs of the subroutines. Similarly, the error of the simulation will be bounded 
by the sum of the error of T and the errors of the subroutines. The simulation will have zero-sided 
error if all of the components do. 

We will describe our quantum black-box networks for MAJORITY as classical randomized 
decision trees that use the following exact quantum black-box network developed by Cleve et al. [5] 
for computing the XOR of two input bits. 

Lemma 3 (Cleve et al. [5]) There exists a quantum black-box network of unit cost that on input
two bits $X_0$ and $X_1$ exactly computes their XOR.
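To make Lemma 3 concrete, the following Python sketch simulates one standard way of obtaining the XOR of two bits with a single query, namely a Deutsch-style circuit acting on basis states $|i, b\rangle$ with the query transformation of Section 2.2. We assume, without having checked the details against [5], that this matches the construction there in spirit; the function name `xor_in_one_query` and the explicit amplitude bookkeeping are ours.

```python
import math


def xor_in_one_query(x0, x1):
    """Simulate a Deutsch-style circuit that learns x0 XOR x1 with one oracle call."""
    x = (x0, x1)
    h = 1 / math.sqrt(2)

    # State after applying Hadamards to |0>|1>: the amplitude of |i, b> is (1/2) * (-1)^b.
    state = {(i, b): 0.5 * (-1) ** b for i in (0, 1) for b in (0, 1)}

    # The single quantum query: |i, b> -> |i, b XOR x_i>.
    state = {(i, b ^ x[i]): amp for (i, b), amp in state.items()}

    # Hadamard on the index qubit i.
    after = {(j, b): 0.0 for j in (0, 1) for b in (0, 1)}
    for (i, b), amp in state.items():
        for j in (0, 1):
            after[(j, b)] += h * amp * (-1) ** (i * j)

    # Probability that measuring the index qubit yields 1; it is 0 or 1 up to rounding.
    p_one = sum(amp ** 2 for (j, _), amp in after.items() if j == 1)
    return round(p_one)
```

In this simulation the measured index qubit equals $X_0 \oplus X_1$ with certainty, so the procedure is exact and costs one query.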

The above argument shows that an XOR decision tree for a function $f$ can be transformed
into a quantum black-box network for $f$ of the same cost. The transformation works in the exact
setting, as well as for zero-sided or arbitrary error $\varepsilon$.

3 Computing MAJORITY Exactly 

In the introduction, we discussed how to use an XOR query to determine the MAJORITY of three
input bits. In this section, we generalize this idea to an input of arbitrary length. We first describe
a general approach for constructing XOR decision trees or exact randomized XOR decision trees
for MAJORITY. We call it the "homogeneous block approach." We use this approach to develop
the "oblivious-pairing" algorithm, an XOR decision tree that computes MAJORITY exactly on
$N$-bit inputs using at most $N + 1 - w(N)$ queries. In Section 5 we will show that this is optimal.

The oblivious-pairing algorithm was first introduced and analyzed by Saks and Werman [12]. 
It forms a first step towards the zero-sided error randomized XOR decision tree for MAJORITY 
which we will develop in Section 4. 

3.1 The homogeneous block approach 

XOR queries allow us to compare bits of the input $X$. If the bits differ in value, we can discard
them since the two of them together will not affect the majority value. If the bits have the same
value, we can combine them into a homogeneous block of size 2, i.e., a subset of 2 input bits which
we know have the same value but we do not know what that value is. More generally, we can
apply the following operation "COMBINE" to two disjoint nonempty homogeneous blocks $R$ and
$S$. Suppose that $|R| \ge |S|$. We compare a bit from $R$ with a bit from $S$. If the bits differ, we
discard block $S$ completely together with $|S|$ bits from block $R$. Otherwise, we combine blocks $R$
and $S$ into a single homogeneous block of size $|R| + |S|$.

In the homogeneous block approach, we keep track of a collection of disjoint nonempty homogeneous
blocks with the property that the majority of the bits in the union of the blocks equals
MAJORITY($X$). We start out with the partition of the input into blocks of size 1, i.e., individual
bits. Then we use some criterion to decide to which two blocks we apply the operation COMBINE.
We keep doing so until we end up in a configuration consisting of an empty collection or one in
which one of the blocks is larger than the union of all other blocks. In the former case, we have a
tie. In the latter, the largest block determines the majority, and querying any of its bits gives us
the value of the majority. One of these situations will eventually be reached since the number of
blocks goes down by 1 or 2 in each step.

Building a homogeneous block of size $k$ requires only $k - 1$ comparisons between the bits in the
block. In general, the number of comparisons performed upon reaching a configuration consisting of
$\ell$ homogeneous blocks equals $N - \ell - c$, where $c$ denotes the number of times two blocks cancelled each
other out completely. It follows that, compared to the trivial procedure of querying every input bit,
the homogeneous block approach saves one query for every block in the final configuration except
the dominating block, and one for every cancellation of equal-sized blocks.

3.2 The oblivious-pairing algorithm 

In the oblivious-pairing algorithm, we first build homogeneous blocks of size 2 by pairing up the
initial blocks of size 1, leaving the last block of size 1 untouched when $N$ is odd. Then we build
blocks of size 4 out of the blocks of size 2, possibly leaving the last block of size 2 untouched, etc.
In general, during the $k$th phase of the algorithm, we will pairwise COMBINE the homogeneous
blocks of size $2^{k-1}$ to either cancel or form homogeneous blocks of size $2^k$. There will be at most
one block of size $2^{k-1}$ left after the end of the $k$th phase.

There can be at most $\lfloor \log N \rfloor$ phases. Afterwards, either there are no blocks left, in which case
we have a "tie," or else all remaining blocks have sizes that are different powers of 2. The largest
block then dominates all the others combined and dictates the majority.

We provide pseudo-code for the oblivious-pairing algorithm in Figure 1. We keep track of the
collection of disjoint nonempty homogeneous blocks as a list $S = (S_j)_{j=1}^{\ell}$ of subsets of
$\{0, 1, \ldots, N - 1\}$ of nonincreasing size. We will always compare two consecutive blocks in the list, say $S_i$ and
$S_{i+1}$, a procedure captured by the subroutine COMBINE. We also use the following notation: If $X$
is homogeneous on a subset $S$ of $\{0, 1, \ldots, N - 1\}$, we write $X_S$ for the value of any bit $X_i$, $i \in S$.

For any positive integer $k$, the blocks of size $2^{k-1}$ are pairwise disjoint. We pair them up during
the $k$th phase of the algorithm. It follows that the number of COMBINE operations during the
$k$th phase is bounded from above by $\lfloor N/2^k \rfloor$. Each application of COMBINE involves one XOR.
Therefore, Lemma 1 gives us an upper bound of $N - w(N)$ on the total number of XORs. There can
be at most one more query, for a total of $N + 1 - w(N)$. This total is reached, e.g., for homogeneous
inputs (all zeros or all ones). There are no cancellations on homogeneous inputs, and $w(N)$ is the
smallest number of power-of-2 blocks that add up to $N$. We conclude:

Theorem 4 (Saks-Werman [12]) The oblivious-pairing algorithm for MAJORITY on $N$-bit inputs
has XOR decision tree cost $N + 1 - w(N)$.

Corollary 5 We can compute MAJORITY exactly on $N$-bit inputs using at most $N + 1 - w(N)$
quantum black-box queries.
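The oblivious-pairing algorithm of Section 3.2 can be sketched in Python as follows. This is an illustrative classical simulation in which each XOR comparison is counted as one query; the function name `oblivious_pairing` and the representation of a block by a single representative index are our own choices, not notation from the paper.

```python
def oblivious_pairing(x):
    """Return (answer, queries), where answer is 0, 1, or "tie".

    `x` is a list of bits. Every XOR of two bits and the final single-bit
    query each count as one query.
    """
    queries = 0
    current = list(range(len(x)))  # representatives of the blocks of size `size`
    size = 1
    leftovers = []                 # (size, representative) of blocks left unpaired

    while len(current) >= 2:
        merged = []
        # Pair up consecutive blocks of the current size.
        for a, b in zip(current[0::2], current[1::2]):
            queries += 1                 # one XOR query
            if x[a] ^ x[b] == 0:
                merged.append(a)         # equal: a homogeneous block of doubled size
            # otherwise the two blocks cancel and are discarded
        if len(current) % 2 == 1:
            leftovers.append((size, current[-1]))  # last block stays untouched
        current, size = merged, 2 * size

    if current:
        leftovers.append((size, current[0]))
    if not leftovers:
        return "tie", queries

    # Remaining sizes are distinct powers of 2, so the largest block dominates.
    _, rep = max(leftovers)
    return x[rep], queries + 1
```

On a homogeneous input of length $N$ no blocks cancel, and the sketch makes $N + 1 - w(N)$ queries, in line with the count derived above; on the input 0101 it declares a tie after 2 queries.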

4 Computing MAJORITY with Zero-Sided Error 

In Section 3, we considered the oblivious-pairing XOR decision tree. We showed that it has a
cost of $N - w(N) + 1$. We now consider exact randomized XOR decision trees for MAJORITY.
Our main result is the randomized greedy-pairing algorithm, for which the number of queries on
any input is highly concentrated around a value of about $\frac{2}{3}N$ on a worst-case input. Using the
techniques discussed in Sections 2.1 and 2.3, this gives us a randomized XOR decision tree and a
quantum black-box network with small zero-sided error of cost about $\frac{2}{3}N$. In Section 6, we will
give a nearly matching lower bound on the expected number of queries on a worst-case input for
randomized XOR decision trees with small zero-sided error.

    input:      $X = (X_i)_{i=0}^{N-1} \in \{0, 1\}^N$
    output:     MAJORITY($X$)
    notation:   $\ell = |S|$
                $S_j$ = $j$th element of $S$, $1 \le j \le \ell$
                $X_{S_j} = X_i$ for any $i \in S_j$, $1 \le j \le \ell$
    subroutine: COMBINE($S$, $i$, $X$)
                  if XOR($X_{S_i}$, $X_{S_{i+1}}$) = 0
                    then replace $S_i$, $S_{i+1}$ in $S$ by $S_i \cup S_{i+1}$
                    else remove $S_i$, $S_{i+1}$ from $S$
    algorithm:
                  for $k = 1, 2, \ldots, \lfloor \log N \rfloor$
                    while $I = \{ j \mid 1 \le j < \ell$ and $|S_j| = |S_{j+1}| = 2^{k-1} \} \ne \emptyset$
                      $i \leftarrow \min I$
                      COMBINE($S$, $i$, $X$)
                  if $\ell = 0$
                    then return "tie"
                    else return $X_{S_1}$

Figure 1: The oblivious-pairing algorithm

In Section 4.1, we discuss a simple randomized version of the oblivious-pairing algorithm. We
carefully analyze the number of queries it makes, as we will need that result later on. In Section 4.2,
we describe a deterministic algorithm of Alonso, Reingold, and Schott [2], the greedy-pairing
algorithm, for which the average number of queries over all $N$-bit inputs is roughly $\frac{2}{3}N$. In Section 4.3,
we analyze a randomized version of the greedy-pairing algorithm. We prove that the number of
queries it makes is with high probability not much larger than $\frac{2}{3}N$.

4.1 The randomized oblivious-pairing algorithm 

The oblivious-pairing algorithm is efficient when we can get pairs of blocks to cancel. Recall that
the number of XORs made in any homogeneous block algorithm for MAJORITY equals $N - \ell - c$,
where $\ell$ denotes the number of blocks at the end, and $c$ the number of cancellations of equal-sized
blocks that occurred. In the oblivious-pairing algorithm, $\ell$ can be at most about $\log N$, so not much
savings can be expected from that term. The number of cancellations can be much larger. On the
input $0101\ldots01$, all $N/2$ pairs of individual bits cancel, and we can declare a tie with only $N/2$
queries. However, even if we know the input is perfectly balanced, there is no guarantee that any
cancellations occur until the very end.

One natural approach is to randomly permute the input bits before we begin the algorithm:
Choose some permutation $\pi$ of $\{0, 1, \ldots, N - 1\}$ uniformly at random, let $X'_i = X_{\pi(i)}$, and run the
oblivious-pairing algorithm on the input $X'$. The distribution of the number of queries on a given
input now only depends on the number of ones and the number of zeros it contains.

Consider the randomized oblivious-pairing algorithm running on a perfectly balanced input of
length $N$. We perform $N/2$ queries comparing individual bits; we expect roughly half of those to
cancel, and half to yield homogeneous blocks of size 2. We next pair up the $N/4$ blocks of size 2,
which takes $N/8$ queries. Again, we expect roughly half of those queries to cancel, and half to yield
blocks of size 4. The overall number of queries should then be about

$$\frac{N}{2} + \frac{N}{8} + \frac{N}{32} + \cdots = \frac{2}{3}N.$$

We prove below that the number of queries the oblivious-pairing algorithm makes on a balanced
input is indeed highly concentrated around $\frac{2}{3}N$.
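The heuristic estimate above can be explored empirically with a small Monte Carlo experiment. The Python sketch below counts the XOR comparisons of oblivious pairing on randomly permuted inputs with a prescribed number of ones and zeros; the function names, the number of trials, and the fixed seed are arbitrary choices of ours, and the experiment illustrates the claim rather than proving it.

```python
import random


def op_comparisons(bits):
    """Number of XOR comparisons oblivious pairing makes on `bits`."""
    count = 0
    level = list(bits)  # one representative value per block of the current size
    while len(level) >= 2:
        nxt = []
        for a, b in zip(level[0::2], level[1::2]):
            count += 1
            if a == b:
                nxt.append(a)  # equal blocks merge; unequal blocks cancel
        level = nxt            # an unpaired last block is never compared again
    return count


def average_comparisons(n_ones, n_zeros, trials=30, seed=0):
    """Average comparison count over random arrangements of the given bits."""
    rng = random.Random(seed)
    bits = [1] * n_ones + [0] * n_zeros
    total = 0
    for _ in range(trials):
        rng.shuffle(bits)
        total += op_comparisons(bits)
    return total / trials
```

For example, `average_comparisons(2048, 2048) / 4096` should come out close to $\frac{2}{3}$, whereas `op_comparisons([1] * 4096)` equals $4096 - w(4096) = 4095$, since homogeneous inputs never cancel.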

However, consider a homogeneous input. Permuting the input bits has no effect; the input
remains homogeneous, blocks will never cancel, and the randomized oblivious-pairing algorithm
still takes $N - w(N) + 1$ queries. We will need to do something else to reduce the computation cost
on such inputs. We return to this question in Section 4.2.

Before doing so, we prove the following theorem about the number of comparisons the
oblivious-pairing algorithm makes on input $X$. We will use the theorem in our analysis of our main result in
Section 4.3.

Theorem 6 There exists a constant $d$ such that the following holds. Let $C_{OP}(X)$ denote the
number of comparisons the oblivious-pairing algorithm makes on input $X$. Let $N > 0$, and let
$A + B = N$, $A, B \ge 0$. Let $X$ be chosen uniformly at random from all strings of $A$ ones and $B$
zeros. Then for any $r \ge 1$,

$$\Pr_X \left[ C_{OP}(X) > N - \frac{2}{3} \min(A, B) + d\sqrt{rN} \right] \le 2^{-r} \log N.$$



The proof of Theorem 6 uses the following tail law. 



Lemma 7 There exists a constant $d'$ such that the following holds. Let $c(X)$ denote the number of
cancellations during the first phase of the oblivious-pairing algorithm on input $X$. Let $N > 0$, and
let $A + B = N$, $A, B \ge 0$. Let $X$ be chosen uniformly at random from all strings of $A$ ones and $B$
zeros. Then for every $r \ge 1$,

$$\Pr_X \left[ \, |c(X) - AB/N| > d'\sqrt{rN} \, \right] < 2^{-r}.$$



The combinatorial problem underlying Lemma 7 is a special case of "Levene's matching
problem" [6], and has been well studied. We suspect that the tail law given in Lemma 7 is known but
have not been able to find a reference. We include a proof in the Appendix.

Proof of Theorem 6. The proof goes by induction on $N$. We first do the induction step.

Assume without loss of generality that $A \ge B$. Look at the sequence of homogeneous blocks of
size 2 after the first phase of oblivious-pairing on input $X$. Let $X'$ denote the input obtained by
replacing each block in this sequence by a single bit of the same value. We have that
$C_{OP}(X) = \lfloor N/2 \rfloor + C_{OP}(X')$.

Let $A'$ denote the number of ones in $X'$, $B'$ the number of zeros, and $N' = A' + B'$. Note that

$$N' = \lfloor N/2 \rfloor - c(X), \quad B' = \left\lfloor \frac{B - c(X)}{2} \right\rfloor, \quad \text{and} \quad A' \ge B'.$$

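Conditioned on $A'$ and $B'$, the distribution of $X'$ is uniform. Therefore, by our induction
hypothesis, we have that with probability at least $1 - 2^{-r} \log N'$,

$$C_{OP}(X') \le N' - \frac{2}{3}B' + d\sqrt{rN'}$$

$$\le \left\lfloor \frac{N}{2} \right\rfloor - c(X) - \frac{2}{3} \left\lfloor \frac{B - c(X)}{2} \right\rfloor + d\sqrt{rN'}$$

$$\le \frac{N}{2} - \frac{B}{3} - \frac{2}{3} c(X) + d\sqrt{\frac{rN}{2}} + \frac{1}{3}.$$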



By Lemma 7, with probability at least $1 - 2^{-r}$,

$$c(X) \ge AB/N - d'\sqrt{rN} \ge \frac{B}{2} - d'\sqrt{rN}.$$

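Taking everything together, and using the fact that $N' \le N/2$, we have that with probability at
least $1 - 2^{-r} \log N' - 2^{-r} \ge 1 - 2^{-r} \log N$,

$$C_{OP}(X) \le N - \frac{2}{3}B + \left( \frac{2d'}{3} + \frac{d}{\sqrt{2}} \right) \sqrt{rN} + \frac{1}{3} \le N - \frac{2}{3}B + d\sqrt{rN},$$

provided $d$ is large enough that $\frac{2d'}{3} + \frac{1}{3} \le (1 - \frac{1}{\sqrt{2}})d$. This proves the induction step.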

By picking $d$ larger as needed, we can take care of the base cases. □

Theorem 6 can be strengthened to show that the random variable $C_{OP}(X)$ is strongly
concentrated around a value slightly smaller than $N - \frac{2}{3} \min(A, B)$. We omit the precise expression for
the concentration point, as it is rather cumbersome and not needed for the sequel. A proof similar
to the above (but simpler and not relying on Lemma 7) shows that the expected value of $C_{OP}(X)$
in Theorem 6 is bounded above by $N - \frac{2}{3} \min(A, B)$.

4.2 The greedy-pairing algorithm 

As we mentioned in Section 4.1, the oblivious-pairing algorithm requires $N - w(N) + 1$ queries on
the all ones input, whether or not we randomize. In contrast, the trivial algorithm for MAJORITY,
which simply queries bits until the observed discrepancy is larger than the number of bits remaining,
takes $\lfloor N/2 \rfloor + 1$ queries on the all ones input. Therefore, we should be able to improve the
oblivious-pairing algorithm.

The oblivious-pairing algorithm always COMBINEs two smallest blocks of equal size. A first
idea is that we may decide to always COMBINE two largest blocks of equal size instead, and stop
as soon as the largest block (if any) is larger than the union of the other blocks. This leads to an
improvement on some inputs, e.g., on homogeneous inputs of length $N = 2^k - 1$: we will build up a
block of size $2^{k-1}$ using $\lfloor N/2 \rfloor$ XORs and query one bit in that block, for a total cost of $\lfloor N/2 \rfloor + 1$.
However, on homogeneous inputs of length $N = 2^k + 1$, we still make $N - 1$ queries: we construct a
block $S_1$ of size $2^{k-1}$, and then perform another $2^{k-1} - 1$ queries to form another large block, even
though one additional query combining $S_1$ with another bit would guarantee a majority.

In order to do better, we should allow COMBINE operations on blocks of unequal size. As 
cancellations of blocks of equal size are beneficial, we will still prefer to COMBINE such blocks, 
but we should only do so if we reasonably expect the answer to be useful. Alonso, Reingold, and 
Schott [2] introduce a homogeneous block algorithm for MAJORITY which does just this: They 
COMBINE two blocks only if they are sure they will need to know the answer. We call this the 
"greedy-pairing" algorithm. 

More precisely, the greedy-pairing algorithm works as follows. Suppose that in some step we
find a pair $S_i$, $S_{i+1}$ of large blocks of equal size. Instead of automatically combining these two
blocks, however, we now ask a question: Are we sure this is necessary? In other words, if we
assumed all blocks up to $i$ agreed, would that still not be enough to determine a majority? If
the answer is yes, we COMBINE the two blocks. If the answer is no, then we try to build up the
largest block by running COMBINE on $S_1$ and $S_2$.

When we compare two blocks of the same size, we are trying to gain by cancelling and reducing
$\ell$ by 2 in a single step. When we compare two blocks of different sizes, we are trying to gain by
greedily constructing a large enough block to guarantee a majority.

Since the only COMBINE operations between blocks of unequal size involve $S_1$, all blocks except
possibly $S_1$ will have sizes that are powers of 2. Say $|S_j| = 2^{s_j}$, $2 \le j \le \ell = |S|$, where the $s_j$'s
are integers. The size of $S_1$ can be written as $|S_1| = (2m + 1) 2^{s_1}$ for some integers $m$ and $s_1$. Note
that $s_1 \ge s_2 \ge \cdots \ge s_\ell$. We will think of $S_1$ as being composed of several power-of-2 blocks. The
smallest such subblock has size $2^{s_1}$.

The precise criterion we use to determine which blocks $S_i$ and $S_{i+1}$ to compare is given in
the pseudo-code of Figure 2. Note that the smallest $j$ such that $s_j = s_{j+1}$ exists during each
execution of the while loop. If there were no such $j$, the block $S_1$ would dominate all the other
blocks combined and we would have exited the loop.

    input:      $X = (X_i)_{i=0}^{N-1} \in \{0, 1\}^N$
    output:     MAJORITY($X$)
    notation:   $\ell = |S|$
                $S_j$ = $j$th element of $S$, $1 \le j \le \ell$
                $X_{S_j} = X_i$ for any $i \in S_j$, $1 \le j \le \ell$
                $s_1$ = largest integer $t$ such that $2^t$ divides $|S_1|$
                $s_j = \log |S_j|$, $2 \le j \le \ell$
    subroutine: COMBINE($S$, $i$, $X$)
                  if XOR($X_{S_i}$, $X_{S_{i+1}}$) = 0
                    then replace $S_i$, $S_{i+1}$ in $S$ by $S_i \cup S_{i+1}$
                    else if $|S_i| > |S_{i+1}|$
                      then remove $|S_{i+1}|$ elements from $S_i$
                           remove $S_{i+1}$ from $S$
                      else remove $S_i$, $S_{i+1}$ from $S$
    algorithm:
                  while $\ell > 0$ and $|S_1| \le \sum_{j=2}^{\ell} |S_j|$
                    $i \leftarrow$ smallest integer $j$ such that $s_j = s_{j+1}$
                    if $\sum_{j=1}^{i} |S_j| > \sum_{j=i+1}^{\ell} |S_j|$
                      then $i \leftarrow 1$
                    COMBINE($S$, $i$, $X$)
                  if $\ell = 0$
                    then return "tie"
                    else return $X_{S_1}$

Figure 2: The greedy-pairing algorithm

The key to the good performance of the greedy-pairing algorithm is the following observation.
Let $M$ denote the index of the $(\lfloor N/2 \rfloor + 1)$st input bit agreeing with the majority. If $X$ is balanced,
let $M = N$. Let $Y$ denote the substring consisting of the first $M$ bits of $X$, and $Z$ the remainder of
$X$. Then the greedy-pairing algorithm never performs any comparisons involving bits of $Z$. This
is because $Y$ forces the majority in all of $X$, and the greedy-pairing algorithm only involves a new
bit $b$ in a comparison if the bits before $b$ cannot force the majority of $X$.

This is the way the greedy-pairing algorithm saves queries compared to the oblivious-pairing
algorithm: by not making the comparisons the oblivious-pairing algorithm makes involving bits
of $Z$. On $Y$, the greedy-pairing algorithm makes some of the comparisons the oblivious-pairing
algorithm makes, but possibly also makes some others. We need to show that there aren't too
many other queries, or at least that we can account for most of them by queries the
oblivious-pairing algorithm makes on $Y$ but the greedy-pairing algorithm does not. We will prove next that
there are at most $O(\log^2 N)$ queries that we cannot account for in that way.
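For concreteness, here is a Python sketch of the greedy-pairing algorithm as we read the pseudo-code of Figure 2. It is a classical simulation that counts each XOR comparison and the final single-bit query as one query. The function name `greedy_pairing`, the 0-based indexing, and the representation of a block as a pair [representative index, size] are our own choices and are not taken from [2].

```python
def greedy_pairing(x):
    """Return (answer, queries), where answer is 0, 1, or "tie"."""
    queries = 0
    blocks = [[i, 1] for i in range(len(x))]  # [representative index, size]

    def exponent(j):
        # s_j: the largest t such that 2^t divides the block size. For every
        # block except possibly the first, the size is a power of 2 and this
        # is simply its base-2 logarithm.
        size = blocks[j][1]
        return (size & -size).bit_length() - 1

    # Continue while the first block does not dominate all the others combined.
    while blocks and blocks[0][1] <= sum(size for _, size in blocks[1:]):
        # Smallest position whose exponent equals that of its successor.
        i = next(j for j in range(len(blocks) - 1)
                 if exponent(j) == exponent(j + 1))
        # If the blocks up to position i could already force the majority,
        # grow the first block instead of touching later bits.
        if sum(s for _, s in blocks[:i + 1]) > sum(s for _, s in blocks[i + 1:]):
            i = 0
        (rep_a, size_a), (rep_b, size_b) = blocks[i], blocks[i + 1]
        queries += 1                                       # one XOR query
        if x[rep_a] ^ x[rep_b] == 0:
            blocks[i:i + 2] = [[rep_a, size_a + size_b]]   # merge
        elif size_a > size_b:
            blocks[i:i + 2] = [[rep_a, size_a - size_b]]   # partial cancellation
        else:
            del blocks[i:i + 2]                            # full cancellation

    if not blocks:
        return "tie", queries
    return x[blocks[0][0]], queries + 1
```

On the all-ones input of length 5, for instance, the sketch merges the first two bits, then compares the resulting block with the third bit, and stops with 3 queries in total, which matches the $\lfloor N/2 \rfloor + 1$ cost of the trivial algorithm on that input.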

Theorem 8 Let $C_{GP}(X)$ denote the number of comparisons the greedy-pairing algorithm makes
on input $X$, and let $C_{OP}$ be defined as in Theorem 6. There exists a constant $d$ such that on any
binary input $X$ of length $N$,

$$C_{GP}(X) \le C_{OP}(Y) + d \log^2 N,$$

where $Y$ denotes the first $M$ bits of $X$ and $M$ the position of the $(\lfloor N/2 \rfloor + 1)$st bit in $X$ agreeing
with the majority. When $X$ is balanced, $M = N$ and $Y = X$.

In fact, a refinement of the argument below shows that

$$C_{OP}(Y) \le C_{GP}(X) \le C_{OP}(Y) + \max(2 \lfloor \log N \rfloor - 3, 0),$$

which is tight. However, the relationship as stated in Theorem 8 is strong enough for our purposes.

In order to prove Theorem 8, we need the following properties of the greedy-pairing algorithm.
They deal with the technical concept of an "unusual comparison," which is a comparison between
$S_1$ and $S_2$ with $s_1 \ne s_2$. These are precisely the comparisons between blocks of different sizes,
provided we view a comparison with as one with the last subblock of Si of size 2*^ . 

Lemma 9 Consider running the greedy-pairing algorithm on an input X and call a comparison
unusual if it is between $S_1$ and $S_2$, and $s_1 \ne s_2$. Let s be an integer. Let T be the first point in
time there is an unusual comparison with $s_2 \le 2^s$. (If there is no such comparison, we let T denote
the end of the algorithm.) Then the following hold:

1. All comparisons the greedy-pairing algorithm makes before T with $|S_{i+1}| \le 2^s$ are also made
by the oblivious-pairing algorithm on input X.

2. After T, the greedy-pairing algorithm makes no comparisons with $|S_{i+1}| \ge 2^s$ and $i > 1$, and
none with $|S_{i+1}| > 2^s$ and $i = 1$.

3. Let $B_j$ denote the jth block $S_2$ of size $2^s$ which the greedy-pairing algorithm compares with $S_1$
at and after T. Then the sequence $B_1, B_2, \ldots, B_r$ are successive blocks of size $2^s$ produced by
the oblivious-pairing algorithm on input X.

4. The outcome of each of the greedy-pairing comparisons referred to in 3 is "unequal" for
$S_2 = B_j$, $1 \le j < r$.

Proof of Lemma 9. We prove claim 1 by contradiction. Suppose that, at some time before T, the
greedy algorithm makes a comparison with $|S_{i+1}| = 2^u$, where $u \le s$, which is not made by the
oblivious-pairing algorithm. Consider the first such time U. Since U < T, the comparison at time
U is not unusual. Since no unusual comparisons with $|S_{i+1}| \le 2^s$ have occurred, we must have
$|S_i| = |S_{i+1}|$. And, by our choice of U, both $S_i$ and $S_{i+1}$ are also formed by the oblivious-pairing
algorithm.

By our choice of U, any earlier blocks of size $2^u$ must have been compared as they are in the
oblivious-pairing algorithm. In particular, there must be an even number of them. So, blocks $S_i$ and
$S_{i+1}$ must be the jth and (j + 1)st blocks of size $2^u$ formed by the oblivious-pairing algorithm for
some odd j. Hence, this comparison is also made by the oblivious-pairing algorithm, contradicting
our choice of U.

We now consider claim 2. Clearly, at time T, only $|S_1|$ can have size larger than $2^s$, and, if
$s_2 > 0$, $|S_i| < |S_2|$ for $i > 2$. (Proof: if $|S_2| = |S_3|$, then $S_3$ must have been formed at some time
U < T; since the algorithm did not do an unusual comparison at time U, it would have chosen to
compare $S_2$ and $S_3$ at time U + 1.)

At time T, let j be the smallest index such that $|S_j| = |S_{j+1}|$. Then we must have
$\sum_{k=1}^{j} |S_k| > \frac{1}{2} \sum_{k=1}^{\ell} |S_k|$. Since all blocks up to $S_j$ have different sizes, and $|S_2| \le 2^s$, we conclude that, at time



Once inequality (1) holds, it remains true for the remainder of the algorithm. (No comparison can
increase the right-hand side. The left side is decreased only by an "unequal" comparison between
$S_1$ and $S_2$, in which case both sides decrease by $|S_2|$.)

So, suppose that, at some later time, there is a block of size $2^s$, and, if a comparison were done
between some $S_i$ and $S_{i+1}$, it would form a second such block. By (1), the greedy-pairing algorithm
would choose to do a comparison between $S_1$ and $S_2$ instead. Hence, from time T onward, there
can be at most one block $S_i$ of size $2^s$ for $i > 1$, which proves claim 2.

To prove claim 3, we let U be the first time (if any) that there is an unusual comparison with
$s_2 < 2^s$. (If $s_2 < 2^s$ at time T, then U = T and r = 0.) By claim 2, all blocks $B_j$ are formed before
time U. So, by claim 1 applied to s - 1, the comparisons which form those blocks are all performed
by the oblivious-pairing algorithm.

Finally, by the above reasoning, at the time that $S_1$ is compared to $B_j$, there is no other block
of size $2^s$. If the comparison were "equal," then the next comparison would be unusual as well,
with $|S_2| < 2^s$, and no additional blocks $B_j$ would form. We conclude that, for each $j < r$, the
comparison between $S_1$ and $B_j$ is "unequal." □

Using Lemma 9, we can prove Theorem 8 as follows. 

Proof of Theorem 8. Fix a nonnegative integer s and look at the comparisons the greedy-pairing
algorithm makes on input X with $|S_{i+1}| = 2^s$. Let T be as defined in Lemma 9.

By claim 1 of Lemma 9, all such comparisons before T are also made by the oblivious-pairing
algorithm on input X at some point in time. As the greedy-pairing algorithm only involves bits
of Y in comparisons, these comparisons are actually made by the oblivious-pairing algorithm on
input Y.

By claims 2 and 3 of Lemma 9, there are r more comparisons the greedy-pairing algorithm makes
at and after time T with $|S_{i+1}| = 2^s$. With these comparisons, we can associate the comparisons
the oblivious-pairing algorithm makes involving the blocks $B_1, B_2, \ldots, B_r$ and their superblocks.
By claim 3, $B_1, B_2, \ldots, B_{r-1}$ are subsequent blocks the oblivious-pairing algorithm produces during
phase s. By claim 4, all of them have the same value. The oblivious-pairing algorithm will spend at
least $r - 1 - \lceil \log(r - 1) \rceil$ comparisons on combining the blocks $B_1, \ldots, B_{r-1}$. So, the greedy-pairing
algorithm makes at most $r - (r - 1 - \lceil \log(r - 1) \rceil) \le 2 + \log N$ queries with $|S_{i+1}| = 2^s$ which
we cannot account for by queries the oblivious-pairing algorithm makes on Y. Adding this surplus
over all values of s we get that

$C_{GP}(X) \le C_{OP}(Y) + (2 + \log N) \log N$,

which establishes the upper bound. □

We point out that Alonso et al. [2] showed that the average case complexity of the greedy-pairing
algorithm is optimal up to an $O(\log N)$ term. In particular, they established the following
upper bound.

Theorem 10 (Alonso et al. [2]) The average number of comparisons made by the greedy-pairing
algorithm over all N-bit inputs equals

$\frac{2}{3} N - \sqrt{\frac{8N}{9\pi}} + O(\log N)$.

The term $\sqrt{8N/(9\pi)}$ comes from the average discrepancy over all N-bit inputs, which is
$\sqrt{2N/\pi} + O(1)$.

However, the analysis by Alonso et al. is not sufficient for our purposes. We need an algorithm 
which performs well on the worst-case input. It is with this goal in mind that we now study a 
randomized version of the greedy-pairing algorithm. 



4.3 The randomized greedy-pairing algorithm 

In Section 4.1, we randomized the oblivious-pairing algorithm by first applying a random permutation
$\pi$ to the input bits. We can use the same technique to randomize the greedy-pairing algorithm.
This is the algorithm which leads to our main result. 
The following analysis is essential. 

Theorem 11 There exists a constant d such that the following holds. Let $C_{GP}(X)$ denote the
number of comparisons the greedy-pairing algorithm makes on input X. Let $N > 0$, and let $A + B = N$,
$A, B \ge 0$. Let X be chosen uniformly at random from all strings of A ones and B zeros. Then
for every $r > 1$,

$\Pr_X \left[ C_{GP}(X) > \frac{1}{2} N + \frac{1}{3} \min(A, B) + d \sqrt{rN} \right] < 2^{-r} \log N$.



Theorem 11 shows that the worst-case inputs for the randomized greedy-pairing algorithm are 
the balanced ones. We conclude: 



Corollary 12 There exists a constant d such that for any positive $\epsilon$ and any binary string X of
length N, with probability at least $1 - \epsilon$ the randomized greedy-pairing algorithm makes no more
than $\frac{2}{3} N + d \sqrt{N \log(\epsilon^{-1} \log N)}$ comparisons on input X.

As with Theorem 6, we will make use of a concentration result in the proof of Theorem 11. 
Here as there, we believe this result is already known, but have not found a reference. A proof is 
included in the Appendix. 
Lemma 13 There exists a constant d such that the following holds. Let $N = A + B > 0$, where
$A \ge B \ge 0$. Let X be chosen uniformly at random from all strings of A ones and B zeros. Let M
denote the index of the $(\lfloor N/2 \rfloor + 1)$st one of X. If A = B, let M = N. Then for every $r > 1$,

$\Pr_X \left[ \left| M - \frac{N^2}{2A} \right| > d \sqrt{rN} \right] < 2^{-r}$.



Proof of Theorem 11.

Let Y be the string consisting of the first M bits of X, where, as before, M is the position of
the $(\lfloor N/2 \rfloor + 1)$st bit in X agreeing with the majority. When X is balanced, M = N and Y = X.
By Theorem 8,

$C_{GP}(X) \le C_{OP}(Y) + O(\log^2 N)$.

If X is exactly balanced, then Y = X is uniformly distributed among strings having N/2 ones
and N/2 zeros. In this case, we have reduced to Theorem 6.

Suppose X is not exactly balanced. Without loss of generality, let A > B. Y has exactly
$\lfloor N/2 \rfloor + 1$ ones and $M - (\lfloor N/2 \rfloor + 1)$ zeros, the Mth bit being a one. Let Y' be the string of
length M - 1 obtained by dropping the last one from Y.

Conditioned on M being fixed, Y' is uniformly distributed among strings having $\lfloor N/2 \rfloor$ ones
and $M - 1 - \lfloor N/2 \rfloor$ zeros. Hence Theorem 6 applied to Y' yields

$C_{OP}(Y') \le M - \frac{2}{3}(M - \lfloor N/2 \rfloor) + d \sqrt{rN} \le \frac{M + N}{3} + d \sqrt{rN}$

with probability at least $1 - 2^{-r} \log N$.

Since Y differs from Y' only in the rightmost bit, oblivious pairing does all the same comparisons
on Y as on Y', plus at most one additional comparison per phase. Hence

$C_{OP}(Y) \le C_{OP}(Y') + \log N$.

Putting this together,

$C_{GP}(X) \le \frac{M + N}{3} + d \sqrt{rN} + O(\log^2 N)$

with probability at least $1 - 2^{-r} \log N$. Since $M \le N$, this is already enough to establish
Corollary 12.

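By Lemma 13, $M \le \frac{N^2}{2A} + d' \sqrt{rN}$ with probability at least $1 - 2^{-r}$. Hence, with probability at
least $1 - 2^{-r}(1 + \log N)$,

$C_{GP}(X) \le \frac{N^2 + 2AN}{6A} + (d + d') \sqrt{rN} + O(\log^2 N)$
$\le \frac{3AN + BN}{6A} + d'' \sqrt{rN}$
$\le \frac{3AN + 2AB}{6A} + d'' \sqrt{rN}$
$= \frac{N}{2} + \frac{B}{3} + d'' \sqrt{rN}$.

□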
Theorem 11 can be strengthened to show that the random variable $C_{GP}(X)$ is strongly concentrated
around a value slightly smaller than $\frac{1}{2} N + \frac{1}{3} \min(A, B)$. We omit the precise expression for the
concentration point, as it is rather cumbersome and not needed for the sequel.

A simplified version of the proof of Theorem 11 shows that the expected number of comparisons
the randomized greedy-pairing algorithm makes on an N-bit input with A ones and B zeros is
bounded above by

$\frac{1}{2} N + \frac{1}{3} \min(A, B) + 2 \log N = \frac{2}{3} N - \frac{1}{6} D + 2 \log N$,

where D denotes the discrepancy of the input. This gives us a bound of the form $\frac{2}{3} N - \Theta(\sqrt{N})$ on
the average-case cost of the (randomized) greedy-pairing algorithm. However, the constant hidden
in the $\Theta(\sqrt{N})$ term is not as good as that achieved by Alonso et al. [2] in Theorem 10.

Using the techniques from Section 2.3, Corollary 12 yields our main result.

Theorem 14 (Main Result) There exists a constant d such that, for any positive integer N and
any $\epsilon > 0$, there exists a quantum black-box network of cost

$\frac{2}{3} N + d \sqrt{N \log(\epsilon^{-1} \log N)}$

that computes the majority of N bits with zero-sided error $\epsilon$.

5 Lower Bounds for Computing MAJORITY Exactly 

Beals et al. [3] establish a lower bound of $N/2$ quantum queries for computing MAJORITY exactly.
In this section, we show that any XOR decision tree computing MAJORITY must use at least
$N + 1 - w(N)$ queries. Hence, the oblivious-pairing algorithm of Section 3 is optimal.

We first define a more general model of computation, a decision tree relative to a set of functions. 
We then show that, relative to the collection of all parity functions, the oblivious-pairing algorithm 
is the best possible. 

Recall that the classical decision tree complexity of MAJORITY equals N. 

5.1 Relative decision tree complexity

A decision tree relative to a class of functions $\mathcal{G}$ is one which is permitted to apply any function
from $\mathcal{G}$ to a subset of the input bits (taken in any order) at unit cost.

Definition 15 ($\mathcal{G}$-decision tree) Let $\mathcal{G} = \{g_1, g_2, \ldots\}$ be a collection of functions where $g_k$ is a
function on $M_k$ bits. A $\mathcal{G}$-decision tree is a deterministic algorithm for a given input length N
which can query its input bits $X_0, \ldots, X_{N-1}$, and which can also perform queries of the form
$g_k(X_{\sigma(0)}, \ldots, X_{\sigma(M_k - 1)})$ where $\sigma$ is a one-to-one function from $\{0, \ldots, M_k - 1\}$ to $\{0, \ldots, N - 1\}$.
The cost of a $\mathcal{G}$-decision tree is the maximum over all N-bit inputs of the total number of queries
performed on that input, including individual input bits as well as functions $g_k$.

Definition 16 ($\mathcal{G}$-decision tree complexity) Let f be a function on $\{0,1\}^N$. The $\mathcal{G}$-decision
tree complexity of f, denoted $D^{\mathcal{G}}(f)$, is the minimum cost of a $\mathcal{G}$-decision tree computing f. When
$\mathcal{G} = \{g\}$, we write this simply as $D^{g}(f)$.
We will consider two instances, namely $\mathcal{G} = \{\text{XOR}\}$ and $\mathcal{G} = \mathcal{PARITY}$, where $\mathcal{PARITY}$
denotes the collection of all PARITY functions (on any number of bits).

We trivially have that $D^{\mathcal{PARITY}}(f) \le D^{\text{XOR}}(f)$ for any function f. The discussion in Section
2.3 shows that there exists a quantum black-box network that computes f exactly with cost at
most $D^{\text{XOR}}(f)$.

The following lemma establishes a limit on how much we can expect $\mathcal{PARITY}$ to help simplify
the computation of a function f. It is an extension of a result of Rivest and Vuillemin [11] for
standard decision trees.

Lemma 17 Let f be a Boolean function on $\{0,1\}^N$. If $D^{\mathcal{PARITY}}(f) \le d$, then $2^{N-d}$ divides
$|f^{-1}(1)|$.

Proof. Each leaf of the decision tree corresponds to a set of inputs: those inputs for which the
computation terminates at that leaf. These sets partition $\{0,1\}^N$; in particular, the accepting
leaves partition $f^{-1}(1)$. So it suffices to prove that the size of the set corresponding to any leaf is
divisible by $2^{N-d}$.

View $\{0,1\}^N$ as a vector space of dimension N over GF(2) (with coordinate-wise addition).
Each parity query or input bit query is of the form: "Is the input in a subspace of codimension 1?"
(A subspace has codimension c if it has dimension N - c.) If every response is "yes," then the set
corresponding to the leaf is also a subspace; since at most d questions were asked, this space is of
codimension at most d. If some response is "no," then the set is an affine subspace. This is either
empty, or nonempty of codimension at most d. In every case, the size of the set is a multiple of
$2^{N-d}$. □
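
The counting step in this proof can be checked by brute force on small cases. The sketch below is an illustration only: the helper names parity, leaf_size and check_divisibility, and the encoding of a parity query as a bit mask with an expected answer, are assumptions introduced here. An input-bit query is the special case of a mask with a single bit set.

```python
import random

def parity(v):
    """Parity (XOR of all bits) of the nonnegative integer v."""
    return bin(v).count("1") & 1

def leaf_size(n, queries):
    """Number of n-bit inputs consistent with every (mask, answer) pair.

    Each pair asks for the parity of the input bits selected by mask
    and records the answer (0 or 1) along the path to the leaf.
    """
    return sum(
        all(parity(x & mask) == ans for mask, ans in queries)
        for x in range(2 ** n)
    )

def check_divisibility(n, d, trials=200, seed=0):
    """Check that random paths of d parity queries give leaf sets whose
    size is a multiple of 2**(n - d) (an empty set counts as size 0)."""
    rng = random.Random(seed)
    for _ in range(trials):
        queries = [(rng.randrange(1, 2 ** n), rng.randrange(2))
                   for _ in range(d)]
        if leaf_size(n, queries) % 2 ** (n - d) != 0:
            return False
    return True
```

For instance, leaf_size(3, [(0b011, 0)]) counts the 3-bit inputs whose two low-order bits agree, a subspace of codimension 1.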



5.2 Lower bound for MAJORITY 

As we have noted, the oblivious-pairing algorithm in Section 3 is an XOR decision tree. Theorem 4
therefore implies that $D^{\text{XOR}}(\text{MAJORITY}) \le N + 1 - w(N)$. We now show that equality actually
holds. Hence, the oblivious-pairing algorithm is optimal.

Theorem 18 $D^{\mathcal{PARITY}}(\text{MAJORITY}) = D^{\text{XOR}}(\text{MAJORITY}) = N + 1 - w(N)$.

Proof. As noted above, we already know that

$D^{\text{XOR}}(\text{MAJORITY}) \le N + 1 - w(N)$

by Theorem 4. Since $D^{\mathcal{PARITY}}(\text{MAJORITY}) \le D^{\text{XOR}}(\text{MAJORITY})$, it suffices to show that

$D^{\mathcal{PARITY}}(\text{MAJORITY}) \ge N + 1 - w(N)$.

We will use Lemma 17 to do so; the first step is to compute what power of 2 divides $|\text{MAJORITY}^{-1}(1)|$.

We first consider the case where N is even, say N = 2m. The $2^{2m}$ possible inputs can be
divided into three types: those with more 1's than 0's, those with more 0's than 1's, and the $\binom{2m}{m}$
perfectly balanced inputs. The number of inputs with a majority of 1's is therefore $2^{2m-1} - \frac{1}{2}\binom{2m}{m}$.
Since $\binom{2m}{m} = (2m)!/(m!)^2$, Corollary 2 states that $\binom{2m}{m}$ is exactly divisible by $2^k$ for
$k = (2m - w(2m)) - 2(m - w(m)) = w(2m) = w(N)$. Therefore, since $w(N) < N$, $|\text{MAJORITY}^{-1}(1)|$ is
exactly divisible by $2^{w(N)-1}$.
If we had $D^{\mathcal{PARITY}}(\text{MAJORITY}) \le N - w(N)$, then, by Lemma 17, we would have $2^{w(N)}$
dividing $|\text{MAJORITY}^{-1}(1)|$. Since this is false, we must have $D^{\mathcal{PARITY}}(\text{MAJORITY}) \ge N + 1 - w(N)$
for even N.

When N is odd, we note that we can use an algorithm for MAJORITY on N variables to
solve the problem on N - 1 variables: pad the N - 1 input bits with one 0. Since the above
argument for the even case only relies on the number of inputs mapped to 1, we thus conclude that
$D^{\mathcal{PARITY}}(\text{MAJORITY})$ for N odd is at least $(N - 1) + 1 - w(N - 1) = N + 1 - w(N)$, which
proves the desired result. □
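
The divisibility facts used in the even case are easy to confirm numerically. The sketch below assumes that $w(N)$ denotes the number of ones in the binary expansion of N, which is how the appeal to Corollary 2 reads; the helper names w, two_adic_valuation and majority_ones_count are introduced here for illustration only.

```python
from math import comb

def w(n):
    """Number of ones in the binary expansion of n (assumed meaning of w)."""
    return bin(n).count("1")

def two_adic_valuation(v):
    """Largest k such that 2**k divides the positive integer v."""
    k = 0
    while v % 2 == 0:
        v //= 2
        k += 1
    return k

def majority_ones_count(n):
    """Number of n-bit inputs with strictly more ones than zeros."""
    return sum(comb(n, k) for k in range(n // 2 + 1, n + 1))
```

For N = 6, for example, there are 22 inputs with a majority of ones, $w(6) = 2$, and 22 is divisible by $2^{w(6)-1} = 2$ but not by 4.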



6 Lower Bounds for Computing MAJORITY with Zero-Sided Error 

Beals et al. [3] prove a lower bound of ^ on the number of queries a quantum black-box network 
needs to compute MAJORITY on N-bit strings with zero-sided error $\epsilon < 1$. We will show that
the cost of a randomized XOR decision tree computing MAJORITY with zero-sided error $\epsilon = o(1)$
cannot be reduced below $\frac{2}{3} N - o(N)$. We will also prove that any classical randomized decision tree
with zero-sided error $\epsilon = \frac{1}{2}$ has to have cost at least N. In fact, we will show the stronger result
that any classical randomized decision tree with arbitrary error bounded by $\frac{1}{4}$ has cost at least N.

This result about randomized XOR decision trees follows directly from the average case lower 
bound of Alonso et al. [2] using a standard argument. 

Theorem 19 (Alonso et al. [2]) There exists a constant d such that the following holds for any 
input length N. For any XOR decision tree computing MAJORITY, the average cost over all inputs 
of length N is at least 



Corollary 20 There exists a constant d such that the following holds for any input length N. For 
any randomized XOR decision tree computing MAJORITY exactly, there exists an input of length 



Proof. Look at the randomized XOR decision tree T as a distribution over deterministic XOR decision
trees $\{T_i\}$. Each deterministic tree $T_i$ in the support of T computes MAJORITY exactly. By
Theorem 19, the average cost of each $T_i$ is at least $g(N)$. Consequently, the expected average cost
of T is at least $g(N)$. Therefore, there exists an input on which the expected number of queries is
at least $g(N)$. □

A randomized XOR decision tree T with zero-sided error $\epsilon$ and cost C can be transformed
into an exact randomized XOR decision tree T' for the same function with an expected number of
queries of at most $C + \epsilon (N - C) \le C + \epsilon N$ on any input. We just run T and whenever it is about
to answer "I don't know," we query individual bits until we know the entire input. Using Corollary
20, we obtain:

Theorem 21 Any randomized XOR decision tree computing MAJORITY on N-bit inputs with
zero-sided error $\epsilon$ has cost at least $\frac{2}{3} N - \epsilon N - O(\sqrt{N})$.
In contrast, a classical randomized decision tree needs N queries to compute MAJORITY with
error $\epsilon$ for any sufficiently small constant $\epsilon$.

Theorem 22 Any randomized decision tree that computes the MAJORITY of N bits with bounded
error $\epsilon < \frac{1}{4}$ has cost at least N.

Proof. Let t denote $\lfloor N/2 \rfloor + 1$. Suppose there exists a randomized decision tree T that computes
MAJORITY on N-bit inputs with bounded error $\epsilon < \frac{1}{4}$ and cost at most N - 1. Without loss of
generality, we can assume that T always queries exactly N - 1 of the N input bits.

First the following observations. Consider a deterministic tree of cost N - 1 and suppose we
pick an N-bit input uniformly at random among those with exactly A ones. Then the probability
that the unique bit not queried is a one equals $A/N$. Also, for any final state s of T, the probability
that we end up there only depends on the number of ones seen when we reach s.

Look at T as a probability distribution over deterministic trees of cost N - 1. Among all final
states that have seen t - 1 ones, let $\alpha$ be the weighted fraction that outputs 0. Consider the input
distribution that is a convex combination of $\beta$ times the uniform distribution over inputs with
exactly t - 1 ones, and $1 - \beta$ times the uniform distribution over inputs with exactly t ones. By
the above observations, the probability of error is at least

$\beta \frac{N - t + 1}{N} (1 - \alpha) + (1 - \beta) \frac{t}{N} \alpha$.

Picking $\beta = \frac{t}{N+1}$ makes the factors of $(1 - \alpha)$ and $\alpha$ equal, so we get that the probability of error
is at least $\frac{t (N - t + 1)}{N (N + 1)}$, which exceeds $\frac{1}{4}$. This contradicts the assumption that $\epsilon < \frac{1}{4}$. □

We note that the bound of $\frac{1}{4}$ in Theorem 22 is essentially tight. Using the notation from the
above proof, the following algorithm does the job: Query N - 1 bits in random order and output
0 if less than t - 1 of them are one, 1 if more, and the outcome of a (biased) coin toss otherwise.

Corollary 23 Any randomized decision tree that computes the MAJORITY of N bits with zero-sided
error $\epsilon < \frac{1}{2}$ has cost at least N.

Proof. Transform the randomized decision tree A with zero-sided error e into the randomized 

decision tree A' as follows: Whenever A says "I don't know," output the outcome of a fair coin 
toss; otherwise answer the same as A. A' has two-sided error e/2. Then apply Theorem 22 to A'. 
□ 

Again, the bound of $\frac{1}{2}$ is essentially tight.

7 Open Questions 

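We can summarize the known results in a table. We fixed the error $\epsilon$ in the table to $N^{-1}$.

Cost of MAJORITY  | Quantum black-box model                                     | XOR decision tree model
                  | Lower bound     | Upper bound                               | Lower bound                        | Upper bound
------------------+-----------------+-------------------------------------------+------------------------------------+------------------------------------
exact             | $N/2$ [3]       | $N - w(N) + 1$                            | $N - w(N) + 1$ [12]                | $N - w(N) + 1$ [12]
zero-sided error  | [3]             | $\frac{2}{3}N + O(\sqrt{N \log N})$       | $\frac{2}{3}N - O(\sqrt{N})$ [2]   | $\frac{2}{3}N + O(\sqrt{N \log N})$
two-sided error   | $\Omega(N)$ [3] | $\frac{1}{2}N + O(\sqrt{N \log N})$ [13]  | $N/2$                              | $\frac{2}{3}N + O(\sqrt{N \log N})$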
This leads to several natural open questions.

• Our results for exact and zero-sided error in the XOR decision tree model are quite tight. The 
corresponding results in the quantum black-box model are not. Can we narrow the gap? The 
quantum black-box model is more powerful than the XOR decision tree model, so we may be 
able to improve the quantum upper bound by applying some other technique to MAJORITY. 

• On the other hand, we may be able to improve the quantum lower bounds, in particular in
the two-sided error case. The best lower bound we currently know is $\Omega(N)$ for any constant
error ratio less than $\frac{1}{2}$. This follows from Paturi's [10] result that the approximating degree of
the majority function (see, for example, [9] for a definition) is $\Theta(N)$, and the observation by
Beals et al. that half the approximating degree is a lower bound for the quantum black-box
complexity in the bounded error setting. The constant hidden in the $\Omega(N)$ of Paturi's result
is much smaller than 1. A constant of 1 would show that Van Dam's approach is essentially
optimal for MAJORITY.

• In this paper, we focused on the exact and zero-sided error settings. The results in the 
table for two-sided error XOR decision trees trivially follow from the classical lower bound 
(Theorem 22) and the upper bound in the zero-sided error setting (Corollary 12). Can we 
exploit the two-sided error relaxation? How about the one-sided error setting? 

• The $O(\sqrt{N})$ term in the lower bound for the cost of a zero-sided error randomized XOR
decision tree comes from the average size of the discrepancy of a random input. It seems likely
that, if we restrict to balanced inputs, we can improve this lower bound to $\frac{2}{3} N - O(\log N)$.
Can we do so?
Appendix: Tail Laws

In this section, we establish Lemmas 7 and 13 as applications of Azuma's inequality (see, for
example, Motwani and Raghavan [8, Section 4.4]).

A sequence of random variables $Y_0, Y_1, \ldots, Y_\ell$ is called a martingale if
$E[Y_i \mid Y_0, Y_1, \ldots, Y_{i-1}] = Y_{i-1}$ for every $1 \le i \le \ell$. Azuma's inequality is a general tail law for martingales:

Theorem 24 (Azuma's Inequality) Let $Y_0, Y_1, \ldots, Y_\ell$ be a martingale. If $|Y_i - Y_{i-1}| \le c_i$ for
every $1 \le i \le \ell$, then

$\Pr[|Y_\ell - Y_0| > \lambda] \le 2 \exp\left( -\frac{\lambda^2}{2 \sum_{i=1}^{\ell} c_i^2} \right)$

for every $\lambda > 0$.

If the underlying sample space $\Omega$ can be written as a product

$\Omega = \prod_{i=1}^{\ell} \Omega_i$, (2)

we can associate a random variable Y with a martingale $Y_0, Y_1, \ldots, Y_\ell$ defined by

$Y_i(x_1, x_2, \ldots, x_i) = E[Y \mid X_1 = x_1, X_2 = x_2, \ldots, X_i = x_i]$ (3)

for $0 \le i \le \ell$, where $X = (X_1, X_2, \ldots, X_\ell)$ denotes the sample. The latter martingale is called the
Doob martingale of Y with respect to the decomposition (2). Note that $Y_0 = E[Y]$ and $Y_\ell = Y$.

Proof of Lemma 7. For simplicity, we assume that N is even; the proof works for odd N as well.

Consider the Doob martingale $Y_0, Y_1, \ldots, Y_\ell$ of Y = c with respect to the decomposition of
the sample string in pairs, i.e., (2) with $\Omega_i = \{0,1\}^2$, $1 \le i \le \ell = N/2$.

Fix $j \in \{1, 2, \ldots, N/2\}$, and $x_1, x_2, \ldots, x_j \in \{0,1\}^2$. The conditional distribution on the right-hand
side of (3) for i = j - 1 can be obtained from the one for i = j by the following transformation:
swap the bit in position 2j - 1 with a bit in a random position $p_1$, $2j - 1 \le p_1 \le N$, and swap
the bit in position 2j with a bit in another random position $p_2$, $2j - 1 \le p_2 \le N$. Since the
transformation affects at most 3 pairs, the value of c can change by no more than 3 units under
this transformation. In fact, a change in c of 3 units is impossible, as it would require a change in
the parity of the bits in the 3 pairs involved, which is impossible. It follows that $|Y_j - Y_{j-1}| \le 2$.

Since $E[c] = AB/(N - 1)$, Theorem 24 yields that

$\Pr[|c - AB/(N - 1)| > \lambda] \le 2 \exp(-\lambda^2 / 4N)$,

from which the bound stated in Lemma 7 follows. □
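
The expectation $E[c] = AB/(N - 1)$ can be confirmed by exhaustive enumeration for small N. The sketch below assumes that c counts the pairs (2j - 1, 2j) whose two bits differ, which is how the pair decomposition in the proof reads; the definition of c itself is given with Lemma 7, and the function name average_unequal_pairs is introduced here for illustration.

```python
from fractions import Fraction
from itertools import combinations

def average_unequal_pairs(a, b):
    """Exact average, over all strings with a ones and b zeros (a + b even),
    of the number of aligned pairs (positions 2j, 2j+1, 0-indexed)
    whose two bits differ."""
    n = a + b
    total = 0
    count = 0
    for ones in combinations(range(n), a):
        x = [0] * n
        for p in ones:
            x[p] = 1
        total += sum(x[2 * j] != x[2 * j + 1] for j in range(n // 2))
        count += 1
    return Fraction(total, count)
```

By linearity of expectation each of the N/2 pairs is unequal with probability $2AB/(N(N-1))$, which gives the stated value.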



Proof of Lemma 13. We will first establish a concentration result for the auxiliary random variables
$C_k$, $1 \le k \le N$, defined as the number of ones among the first k positions of the sample string.

Consider the Doob martingale of $Y = C_k$ with respect to the trivial decomposition (2) with
$\Omega_i = \{0,1\}$, $1 \le i \le \ell = N$.

Fix $j \in \{1, 2, \ldots, N\}$, and $x_1, x_2, \ldots, x_j \in \{0,1\}$. The conditional distribution on the right-hand
side of (3) for i = j - 1 can be obtained from the one for i = j by swapping the bit in position j
with a bit in a random position in $\{j, \ldots, N\}$. The swapping process can affect the value of
$C_k$ by at most one for $j \le k$, and not at all for $j > k$. It follows that $|Y_j - Y_{j-1}| \le 1$ for $j \le k$;
$Y_j = Y_{j-1}$ for $j > k$. Note that $E[C_k] = kA/N$. Theorem 24 yields that

$\Pr[|C_k - kA/N| > \lambda] \le 2 \exp(-\lambda^2 / 2k)$. (4)

For any $\lambda$, if $\left| M - \frac{N^2}{2A} \right| > \lambda$, then either

1. $C_{\frac{N^2}{2A} - \lambda} > \frac{N}{2} = E[C_{\frac{N^2}{2A} - \lambda}] + \frac{\lambda A}{N}$, or

2. $C_{\frac{N^2}{2A} + \lambda} \le \frac{N}{2} = E[C_{\frac{N^2}{2A} + \lambda}] - \frac{\lambda A}{N}$.

By (4), applied with deviation $\lambda A / N$ and $k \le N$, the probability that at least one of the above occurs is at most
$4 \exp(-\lambda^2 / 8N)$, since $A/N \ge 1/2$. The bound stated in Lemma 13 follows. □
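
Inequality (4) can be compared against the exact distribution of $C_k$, which is hypergeometric. The sketch below is an illustration under that observation; the function names tail_probability and azuma_bound are introduced here and are not part of the original text.

```python
from math import comb, exp

def tail_probability(n, a, k, lam):
    """Exact Pr[|C_k - k*a/n| > lam], where C_k is the number of ones among
    the first k positions of a uniformly random string with a ones and
    n - a zeros."""
    mean = k * a / n
    total = comb(n, k)
    hits = sum(
        comb(a, c) * comb(n - a, k - c)
        for c in range(max(0, k - (n - a)), min(a, k) + 1)
        if abs(c - mean) > lam
    )
    return hits / total

def azuma_bound(k, lam):
    """Right-hand side of inequality (4)."""
    return 2 * exp(-lam ** 2 / (2 * k))
```

Evaluating both functions on a grid of small parameters shows how much slack the martingale bound leaves relative to the exact tail.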